Abstract

Hidden services were deployed on the Tor
anonymous communication network in 2004. Announced
properties include server resistance to distributed
DoS. Both the EFF and Reporters Without Borders
have issued guides that describe using hidden
services via Tor to protect the safety of dissidents
as well as to resist censorship.

We present fast and cheap attacks that reveal the
location of a hidden server. Using a single hostile Tor
node we have located deployed hidden servers in
a matter of minutes. Although we examine hidden
services over Tor, our results apply to any client
using a variety of anonymity networks. In fact, these
are the first actual intersection attacks on any
deployed public network, thus confirming general
expectations from prior theory and simulation.

We  recommend  changes  to  route  selection  design 
and  implementation  for  Tor.  These  changes  require 
no  operational  increase  in  network  overhead  and 
are  simple  to  make;  but  they  prevent  the  attacks  we 
have  demonstrated.  They  have  been  implemented. 


1  Introduction 

Tor is a distributed low-latency anonymous
communication network developed by the Naval
Research Laboratory and the Free Haven Project.
It is currently the largest anonymity network in
existence, with about 450 server nodes around the
world at the time of writing. It is popular and highly
recommended: it was rated one of the hundred best
products of 2005 by PC World. Since 2004 Tor has
also been used to underlie services offered from
hidden locations. These were introduced [?] as
resistant to distributed DoS since they were designed
to require a DDoS attack on the entire Tor
network in order to attack a hidden server. Hidden
servers have also been recommended for preserving
the anonymity of the service offerer and to resist
censorship. Specifically, Undergroundmedia.org
has published a guide to “Torcasting”
(anonymity-preserving and censorship-resistant podcasting).
Both the Electronic Frontier Foundation and
Reporters Without Borders have issued guides that
describe using hidden services via Tor to protect the
safety of dissidents as well as to resist censorship.
There have been several recent cases in the news in
which anonymous bloggers have or have not been
exposed and have or have not lost jobs, etc., as a
result, depending on the policy of their ISP, the
interpretation of laws by various courts, and numerous
other factors. Recommendations for a technology
to protect anonymous bloggers and other publishers,
regardless of legal protection, would thus seem
to be timely and encouraging.

The Tor developers are careful, however, to warn
against using Tor in critical situations: upon startup
the Tor client announces, “This is experimental
software. Do not rely on it for strong anonymity.”
Nonetheless, with increasing high-profile
recommendations to use Tor’s hidden services for
applications such as those above, it is important to assess
the protection they afford. In this paper we
demonstrate attacks (not simulations) on the deployed Tor
network that reveal the location of a hidden server.
The attacks are cheap and fast: they use only a
single hostile Tor node and require from only minutes
to a few hours to locate a hidden server.

Although we examined hidden services on Tor,
our results are not limited in principle to either
hidden services or to Tor. They should apply to the
hidden service design even if run on another
underlying anonymity network and should apply to other
clients using an anonymity network, not just to
hidden servers.

We believe that ours are the first attacks that
locate hidden servers, whether on hidden services on
a deployed network or in simulation. Also, while
there have been simulations and analytic results, we
believe ours are the first published intersection
attacks carried out on a deployed anonymity network.

In Section 2 we review previous related work.
In Section 3 we describe the design of Tor’s hidden
services. In Section 4 we present various attacks
and the experimental results from running them.
In Section 5 we describe various countermeasures
that might be taken against our attacks and their
effectiveness. We also describe an implementation
feature of Tor that our experiments uncovered
and how to change it to better resist the attacks.
In Section 6 we conclude with recommendations
for simple-to-implement design changes to hidden
services. These also should not add to the number
of hops or otherwise increase overhead to the
design, but they should resist our attacks. We have
discussed both the implementation feature we
uncovered and our recommended design changes with
the Tor developers. As a result, the latest version of
Tor is resistant to the attacks we present herein.
Finally, we discuss open problems and future work.

2  Previous  Work  on  Hiding  Services 
and  Anonymity 

The earliest reference we can find to a system
that hides the location of a service from those using
it is Ross Anderson’s Eternity Service [?]. Therein
it is suggested that servers hold encrypted files, and
these files are to be accessed by anonymous
communication to prevent uncovering of the location
of a server from which the file is being retrieved.
Early presentations of onion routing from the same
era described the use of onion routing to hide the
location of an automated classification downgrader
so that users of the service would not be able to
attack it. Earlier still, Roger Needham noted the
fundamental connection between anonymity and the
inability to selectively deny service [19, 20], which
was one of the motivating ideas in the Eternity
Service. The idea of hiding the location of a
document (or encrypted fragment of a document) also
underlies many censorship-resistant publishing
designs such as Free Haven [?] and Tangler [28].

Anonymous communication networks were
introduced by David Chaum [?]. He described a
network that distributes trust across multiple nodes
that carry the communication. The design is of
a public-key-based, high-latency anonymous
communication network such as might be appropriate
for email. It is not for use in bidirectional,
low-latency communication, such as web traffic, chat,
remote login, etc. Low-latency communication
anonymity was introduced for ISDN [22], but made
to anonymize within a group of users exchanging
fixed and equal bandwidth with a local telephone
switch rather than anonymizing within an
Internet-wide group with diverse bandwidth needs such as
occur in the just mentioned applications. The oldest
anonymous communication system for web traffic
is probably the Anonymizer [?]. Unlike the Chaum
design, all traffic passes through a single proxy,
making it a single point of failure and/or attack in
many ways. Also unlike Chaum mixes, it does not
actually delay and mix traffic. Traffic is processed
FIFO. The Anonymizer is also probably one of
the most widely used anonymization systems: they
claim to have millions of users.

The first published, as well as the first deployed,
distributed system for low-latency anonymous
Internet communication was onion routing [?] in
1996, followed by the Freedom Network [8] from
1999 to 2001. The current version of onion routing,
Tor [13], was deployed in late 2003, and hidden
services using Tor were deployed in early 2004.

All of these low-latency anonymity systems
work by proxying communication through multiple
hops; at each hop the communication changes
its appearance by adding or removing a layer of
encryption (depending on whether it is traveling from
the circuit originator to responder or vice versa).
They all use public key cryptography to distribute
session keys to the nodes along a route, thus
establishing a circuit. Each session key is shared
between the circuit initiator (client) and the one node
that was given the key in establishing the circuit.
Data that passes along the circuit uses these
session keys. Both Freedom and Tor have a default
circuit length of three nodes. For more details
consult the above cited work. The Java Anon Proxy
(JAP)/Web MIXes [?] is another popular system for
diffused low-latency anonymity. However, unlike
the others mentioned here, it works by mixing and
by diffusing only trust and jurisdiction. It does not
hide where communication enters and leaves the
network. All communication that enters at one
location leaves together (now mixed) at another
location. As such it is not directly amenable to the
hidden service design to be described presently. JAP
has been deployed since 2000.

Hidden services in Tor, as described in the next
section and in [?], rely on a rendezvous server,
which mates anonymous circuits from two
principals so that each relies only on himself to build
a secure circuit. The first published design for a
rendezvous service was for anonymous ISDN
telephony [22] rather than Internet communication. As
such it had very different assumptions and
requirements from the rendezvous servers we describe,
some of which we have already noted above. A
rendezvous server for IRC chat was mentioned in [?];
however, the first detailed design for a rendezvous
server for Internet communication was by
Goldberg [?]. It differs in many ways from rendezvous
servers as used by Tor’s hidden services, but we
will not discuss Goldberg’s design further here.

There is much literature on attacking anonymous
communication [?]. Rather than single out
any of it here, we cite the relevant prior literature
at appropriate points below. The current paper is
the first to focus specifically on attacks for locating
hidden services.

3  Location-hidden  Services  in  Tor 

One of the major vulnerabilities for a hidden
service in Tor is the server’s selection of the first and
last node in the communication path. To a first
approximation, if an adversary can watch the edges of
a Tor circuit, then she can confirm who is
communicating. This is because the low-latency
requirements make it easy to confirm the timing signature
of traffic flowing (in both directions) over the
circuit. This is true whether the adversary controls
the Tor nodes at the edges of the circuit or is just
observing the links from those nodes to the
initiator and responder. Actually, this vulnerability has
always been alleged and assumed but never
previously demonstrated. A byproduct of our analysis of
hidden services is that we experimentally
corroborate this traffic confirmation on Tor circuits. For
hidden services, this means that the service is
vulnerable in every communication path it sets up with
a client if a member of the path can determine it is
being used by a hidden service and that it is the first
node in the path.

In order to see how our attacks that locate
hidden servers are done we need to describe how the
hidden service communication works. Fig. 1 shows
a normal setup of this communication channel.

In the current implementation of Tor, a
connection to a hidden service involves five important
nodes in addition to the nodes used for basic
anonymous communication over Tor.

• HS, the Hidden Server offering some kind of
(hidden) service to the users of the Tor
network, e.g. web pages, mail accounts, login
service, etc.

• C, the client connecting to the Hidden Server.

• DS, a Directory Server containing information
about the Tor network nodes and used as the
point of contact for information on where to
contact hidden services.

• RP, the Rendezvous Point is the only node in
the data tunnel that is known to both sides.

• IP, the Introduction Point where the Hidden
Server is listening for connections to the
hidden service.

Figure 1. Normal use of hidden services
and rendezvous servers

A normal setup of communication between a
client and a hidden service is done as shown
in Fig. 1. All the displayed message flows are
anonymized, i.e., they are routed through several
anonymizing nodes on their path towards the other
end, as described in Section 2. Every arrow and
connection in the figure represents an anonymous
channel consisting of at least two intermediate
nodes. (Hereafter, we use ‘node’ to refer
exclusively to nodes of the underlying anonymization
network, sometimes also called ‘server nodes’.
Although we are considering the Tor network
specifically, the setup would apply as well if some other
anonymizing network were used to underlie the
hidden service protocol. The only exceptions are
C and HS, which may be anonymization nodes
or they may be merely clients external to the
anonymization network.)

First the Hidden Server connects (1) to a node
in the Tor network and asks if it is OK for the node
to act as an Introduction Point for his service. If
the node accepts, we keep the circuit open and
continue; otherwise HS tries another node until
successful. These connections are kept open forever,
i.e., until one of the nodes restarts or decides to
pull it down.¹ Next, the Hidden Server contacts (2)
the Directory Server and asks it to publish the
contact information of its hidden service. The hidden
service is now ready to receive connection requests
from clients.

In order to retrieve data from the service the
Client connects (3) to DS and asks for the contact
information of the identified service and retrieves
it if it exists (including the addresses of
Introduction Points). There can be multiple Introduction
Points per service. The Client then selects a node
in the network to act as a Rendezvous Point,
connects (4) to it and asks it to listen for connections
from a hidden service on C’s behalf. The Client
repeats this until a Rendezvous Point has accepted,
and then contacts (5) the Introduction Point and
asks it to forward the information about the selected
RP.² The Introduction Point forwards (6) this
message to the Hidden Server who determines whether
to connect to the Rendezvous Point or not.³ If OK,
the Hidden Server connects (7) to RP and asks to
be connected to the waiting rendezvous circuit, and
RP then forwards (8) this connection request to the
Client.

Now  RP  can  start  passing  data  between  the  two 
connections  and  the  result  is  an  anonymous  data 
tunnel  (9)  from  C  to  HS  through  RP. 
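
For reference, the nine numbered flows of the setup can be
written down as a small table in code. The sketch below is only
a restatement of the description above in Python; the `Flow`
record and the `initiated_by` helper are illustrative names and
are not part of Tor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """One numbered message flow from the hidden service setup."""
    step: int
    sender: str
    receiver: str
    purpose: str


# The flows (1)-(9) as described in the text. Every flow travels
# over an anonymous circuit, which is not modelled here.
SETUP_FLOWS = [
    Flow(1, "HS", "IP", "ask node to act as Introduction Point"),
    Flow(2, "HS", "DS", "publish contact information"),
    Flow(3, "C", "DS", "look up contact information"),
    Flow(4, "C", "RP", "ask node to act as Rendezvous Point"),
    Flow(5, "C", "IP", "pass on information about the selected RP"),
    Flow(6, "IP", "HS", "forward the RP information"),
    Flow(7, "HS", "RP", "connect to the waiting rendezvous circuit"),
    Flow(8, "RP", "C", "forward the connection request"),
    Flow(9, "C", "HS", "anonymous data tunnel through RP"),
]


def initiated_by(principal: str) -> list:
    """Return the step numbers of all flows sent by one principal."""
    return [flow.step for flow in SETUP_FLOWS if flow.sender == principal]
```

Listing the flows by sender makes it easy to see that HS itself
opens the circuits in steps (1), (2) and (7), and step (7) is the
circuit the attacks concentrate on.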

From  this  we  observe  the  following  facts  about 
the  nodes  in  the  network: 

•  C  does  not  know  the  location  (IP  address)  of 
HS,  but  knows  the  location  of  RP; 

•  HS  does  not  know  the  location  of  C,  but  knows 
the  location  of  RP; 

• RP does not know the location of either C or
HS, and he knows neither the service he is
serving nor the content of the messages
relayed through him;

• there are multiple (currently three) nodes
between HS and RP and two nodes between C
and RP, to hide traffic and create a degree of
anonymity on both ends; and

• any member of the network which claims to
offer stability can be used by HS to form an
anonymous tunnel to RP, including C if it is
a node in the anonymization network. This is
the basis of our attacks.

1 In Tor, any node in a circuit can initiate a circuit teardown.

2 Optionally, this could include authentication information
for the service to determine from whom to accept connections.

3 This flow is over the same anonymous circuit as (1),
similarly for (4) and (8).

Figure 2. Vulnerable location of Attacker
in communication channel to
the Hidden Server

4  Attacks  and  Experimental  Results 

We have done experiments using multiple attack
methods in order to determine the IP address of the
Hidden Server. We will here first describe the setup
of the experiment and then four attack methods.
The attacks can be carried out by an adversary that
controls merely a single node in the network. Since
anyone can run a Tor node simply by volunteering,
this is trivial. (In fact the adversary need only run
a “middleman” node, which never lets circuits exit
the anonymization network. The burden of running
a middleman node is typically less than that of
running an exit node. Of the roughly 250 nodes in the
Tor network at the time the experiments were done
only about 100 allowed exit to port 80.) At the end
we describe an accelerated attack using two
compromised nodes.

Fig. 2 shows the scenarios that an attacker,
hereafter Alice, wants to achieve in connections to the
Hidden Server. Alice controls the Client and one
node. Her goal is to control Node 1 of the circuit.
Certain circuits will yield a match of traffic pattern
with what is expected given when C sends to and
receives from HS. Alice will look for such pattern
matches among all active circuits through the node
she owns. If she finds a match, then her node has
been made part of the circuit between the Hidden
Server and the Rendezvous Point as Node 1, 2 or 3.
From this she will be able to determine a few facts.
First she will know when she has the node closest to
RP (Node 3) since she knows RP’s IP address, and
she can easily abandon the circuit and attack again.
Second, if her node has an unknown IP address on
both sides of the matching circuit, she knows she is
either Node 1 connected directly to HS or Node 2
in the circuit. This enables her to use timing or
statistical methods to determine her position as will be
described later.
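
The first of these deductions can be sketched in a few lines of
Python. This is an illustration only, assuming that a timing
match has already been confirmed for the circuit and that Alice's
node can see the addresses of its two neighbours on that circuit;
the function and parameter names are hypothetical.

```python
def classify_position(predecessor_ip: str, successor_ip: str, rp_ip: str) -> str:
    """Narrow down Alice's position on a matched HS-to-RP circuit.

    Assumes the circuit has already matched the generated traffic
    pattern. Only the RP address is known to Alice in advance.
    """
    if rp_ip in (predecessor_ip, successor_ip):
        # A neighbour is the Rendezvous Point, so this is Node 3:
        # abandon the circuit and attack again.
        return "node 3"
    # Both neighbours are unknown: Node 1 (next to HS) or Node 2.
    # Telling these apart needs the timing or statistical methods
    # described later in the paper.
    return "node 1 or node 2"
```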

We will continue sampling data until we have
enough to determine when Alice is connecting to
the hidden service as Node 1 in the circuit
towards RP, at which point we will know the Hidden
Server’s IP address.

Our attack description is obviously based on
hidden services as deployed on Tor; however, the
basic approach will identify a client of a
low-latency, free-route anonymity network, not just
hidden servers using Tor. These attacks should work on
networks such as Freedom [8] or Crowds [?],
despite their many differences from Tor. For systems
such as Web MIXes [?], it is difficult to briefly say
anything about either what a hidden service design
over such a system would look like or about the
relation to our attacks. On the one hand,
becoming a node in the network is tightly controlled, and
all circuits are through cascades (shared uniform
fixed routes). Thus, our attacks would simply not
be possible. On the other hand, much of the point
of the attacks is to determine the point where
connections enter and leave the anonymity network. In
Web MIXes, this information is given, so there is
no need to attack to obtain it.

4.1  Experimental  Setup 

Our experiments were conducted using two
different hidden services running at client nodes
connecting to the Tor network, one in Europe and one
in the US. The services offered a couple of web
pages and images, which were pulled down in
different ways and with different timing patterns. The
documents and images varied in size from 2KB to
120KB. The connection from the Client to the
Hidden Server was done through a random Rendezvous
Point (cf. Section 4.6), and the connection from
the Client to the Rendezvous Point was shortened
to a path length of one. (This will be
described more fully presently.)

The Hidden Service was not offered, or known
to, any other node in the network except the
directory service. Only the Client knew about how to
contact the service, so that all contact to and from
the Hidden Server was either caused by the Client,
or by the Hidden Server preparing to operate
(making circuits and downloading new updates from the
Directory Servers). This is not a limitation on the
implications of the experimental results for
publicly known and accessed hidden services: the
timings of data are done with high enough precision
that the possibility of two identical patterns from the
same service routing through the same node at the
exact same time is negligible.⁴ No hidden server is
likely to get two specially designed requests (like
ours) from distinct clients and respond to them at
the exact same time. Thus a false positive in our
timing analysis is highly unlikely (Section 4.2).

4 One reason for not doing the experiment on a publicly
known server in the Tor network is, of course, the possible
legal implications. In addition, not wanting to cause harm to
the project and its participants, we avoided announcements until
there were countermeasures available and deployed.

The Client computer was also announced as a
middleman node, i.e. not having connections out
of the anonymity network, and this node is where
all Alice’s sampling of data takes place. By using
the node both as a server inside the network and as
the Client asking for the web pages from the
Hidden Server, the attacker is able to get precise timing
without having to externally synchronize the time
with another node. This server node in the Tor
network had to use a logging mechanism when
sampling the active circuits during the attacks. In order
to avoid reference to the correct IP address during
the timing analysis we converted the IP addresses
by use of a simple prefix-preserving scheme. If we
were to use permanent logging of data, we would
use a better and more secure pseudonymizing IP
logging scheme [?].
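
The paper does not say which prefix-preserving scheme was used.
Purely as an illustration of what "prefix preserving" means, the
Python sketch below pseudonymizes an IPv4 address bit by bit, so
that two addresses sharing their first k bits map to outputs
that also share their first k bits. The keyed-HMAC construction,
the function name and the key are assumptions made for this
example and should not be read as the authors' method.

```python
import hashlib
import hmac
import ipaddress


def pseudonymize_ipv4(address: str, key: bytes) -> str:
    """Map an IPv4 address to a pseudonym, preserving shared prefixes.

    Output bit i is input bit i XOR a keyed pseudo-random bit that
    depends only on input bits 0..i-1, so equal prefixes stay equal.
    """
    bits = format(int(ipaddress.IPv4Address(address)), "032b")
    out = []
    for i, bit in enumerate(bits):
        # The flip bit is derived from the key and the preceding bits only.
        digest = hmac.new(key, bits[:i].encode("ascii"), hashlib.sha256).digest()
        flip = digest[0] & 1
        out.append(str(int(bit) ^ flip))
    return str(ipaddress.IPv4Address(int("".join(out), 2)))
```

With a fixed key the mapping is deterministic, so the same real
address always yields the same pseudonym within one analysis.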

The  attacker  must  also  make  some  minor 
changes  to  the  application  code  running  at  the 
Client  node  in  order  to  enable  and  strengthen  the 
attacks: 

• Alice’s Client will connect directly, i.e. in one
hop, to the Rendezvous Point to shorten the
path and latency of traffic between the Client
and the Hidden Server, thereby making it
easier to set up and correlate the traffic patterns.

• Alice’s Client will tear down the circuit to a
Hidden Server after each pattern is
successfully communicated. This will disable reuse
of circuits and force the construction of a new
circuit on the next connection request.

• In addition to being the Client, Alice is also
running as a server middleman node
participating in the network and carrying traffic for
the other nodes. She will maintain a list of
active circuits (routed connections) and try to
correlate the generated circuit data with all the
other circuits to find out if she is carrying the
same traffic data as both Client and as a server
node.

• Alice’s server node will report a falsely high
uptime and the maximum network bandwidth
to the directory server in order for other nodes
to trust it for their circuits. This is still possible
as there is (yet) no method for keeping reliable
track of uptime at the different servers.

Once  this  is  implemented,  Alice  is  ready  to  use 
the  methods  of  attack  described  below. 

4.2  Timing  Analysis 

The attacker uses the logged timing data and
direction information from the generated data set and
the sampled data set (from each circuit active in that
period of time) to accomplish two different things:

1. Positively identify that Alice’s node is made a
part of a circuit; and

2. If (1) is true, determine at which position in
the circuit she is located.

Figure 3. Example of data sets, matching
and response times

To  identify  if  the  generated  data  is  found  within 
one  of  the  sampled  data  sets,  Alice  is  faced  with  a 
comparison  of  one  sampled  data  set,  the  generated 
set  done  by  the  Client,  to  all  of  the  sampled  data 
sets  done  by  the  server.  For  all  connections  to  the 
Hidden  Server  there  were  a  few  hundred  circuits 
active  at  Alice’s  Tor  node  during  each  of  the  sample 
periods. 

Our match confirmation is an extended version
of the packet counting attack described by
Serjantov and Sewell [24]. In addition to basic counting
of cells, we also make use of precise timing
information of when cells were received and
transmitted, and the direction of each individual cell
passing in order to determine a circuit match. An
example is depicted in Fig. 3. Alice uses the direction of
traffic in addition to the timing of the cells’ arrival
to match all outgoing and incoming traffic in the
generated data set. Notice that there is also noise
occurring in the sampled data set. We compare our
known data to one other specific set at a time, and
our algorithm only checks if the generated data may
be a part of the sampled data. Therefore, it makes
no estimate of how probable the match is.
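
As a rough illustration of this containment check, the following
Python sketch tests whether every cell in the generated set has
a plausible counterpart in one sampled circuit. It is not the
matching code used in the experiments: the cell representation,
the `max_delay` window, and the decision to check the two
directions independently are all assumptions made for the example.

```python
from typing import List, Tuple

TO_SERVER = "to_server"   # cells travelling from the Client towards HS
TO_CLIENT = "to_client"   # cells travelling from HS towards the Client

Cell = Tuple[float, str]  # (timestamp in seconds, direction)


def _times_fit(gen_times, samp_times, lo_offset, hi_offset) -> bool:
    """Greedily pair each generated time with a distinct sampled time
    lying in [g + lo_offset, g + hi_offset]; inputs must be sorted."""
    j = 0
    for g in gen_times:
        lo, hi = g + lo_offset, g + hi_offset
        while j < len(samp_times) and samp_times[j] < lo:
            j += 1                      # too early: treat as noise
        if j == len(samp_times) or samp_times[j] > hi:
            return False                # no counterpart inside the window
        j += 1                          # consume the matched sampled cell
    return True


def may_contain(generated: List[Cell], sampled: List[Cell],
                max_delay: float = 1.0) -> bool:
    """Return True if the generated set may be part of the sampled set.

    Cells sent by the Client must appear at the node shortly after
    they were sent; cells received by the Client must have passed
    the node shortly before. Extra sampled cells count as noise.
    """
    windows = ((TO_SERVER, 0.0, max_delay), (TO_CLIENT, -max_delay, 0.0))
    for direction, lo, hi in windows:
        gen = sorted(t for t, d in generated if d == direction)
        samp = sorted(t for t, d in sampled if d == direction)
        if not _times_fit(gen, samp, lo, hi):
            return False
    return True
```

Like the algorithm described above, this yields only a yes/no
answer and makes no estimate of how probable the match is. It
relies on the Client and the node sharing one clock, as they do
in the experimental setup.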

Alice is also able to separate when there is a
single match and when there are multiple matches in
a data set. There is a potential for multiple matches
in a set, for example, if a circuit is carrying lots of
traffic in both directions, we will probably have a
timing “match” due to the data load. In the attack
Alice knows that only one attack circuit is set up
at a time, and each attack circuit is set up for only
Alice’s requests at that time. So, she can use the
amount of traffic relayed through the attack circuit
as a parameter. The small overhead and extra
information from setting up the tunnelled connections
etc., should not be more than a few cells,
something our experiments confirm. Therefore the
attacker may discard the samples that are more than
a few percent larger than the generated set.
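
The size filter can be sketched as follows. The 5% default margin
and the lower bound (a circuit that carried fewer cells than the
generated set cannot contain it) are assumptions made for this
Python illustration; the paper says only "a few percent".

```python
def plausible_circuits(generated_cells: int, circuit_cells: dict,
                       margin: float = 0.05) -> list:
    """Return ids of sampled circuits whose cell count is at least the
    generated count and at most `margin` larger than it.

    `circuit_cells` maps a circuit id to the number of cells sampled
    on that circuit during the attack period.
    """
    upper = generated_cells * (1.0 + margin)
    return sorted(
        circuit_id
        for circuit_id, count in circuit_cells.items()
        if generated_cells <= count <= upper
    )
```

Applying the filter before the timing comparison removes busy
circuits that would otherwise produce a timing "match" simply
because of their data load.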

Multiple matches could also be a possible result
in a future scenario where the circuits may be used
to carry data for different clients. In this case the
attacker must try to retrieve a list of all possible
matches of the generated data within the sampled
set and should then be able to use correlation
techniques on the timing data to calculate a probability
of the best match. We have not tested this part as
it would require major changes in functionality for
the deployed Tor network.

4.3  Service  Location  Attack 

First we look at two different situations, the
Server Scenario and the Client Scenario, based on
whether the hidden service is located on a node
within the anonymity network, or on a client
using the network but not participating as a network
node.

The Client Scenario is most often used when it is
desired that HS not be listed in the directory service
as a participating server node of the network.
Another reason for this scenario is that the user may be
unable to set up a node directly reachable from the
Internet (e.g., it must be located behind a firewall)
but still wants to offer his service.

The  Server  Scenario  is  most  often  used  to  hide 
the  service  traffic  within  all  the  other  traffic  running 
through  the  server  node.  This  is  often  regarded  as 
a  reasonable  effort  to  improve  the  cover  of  traffic 
originating  at  the  node. 

5 Actually the overhead is normally less than 10 cells, but the
extra margin has not given any false positives yet.

Table 1. Experimental results of our attacks.

          Sample   Time to       Circuits    Matched    Largest     Second
          time     first match   completed   circuits   single IP   largest
Server 1  7.8h     15 min        676         37         46%         5%
Server 1  6.8h     3 min         432         26         54%         7%
Server 2  4.9h     28 min        447         31         71%         3%
Server 2  10.6h    3 min         990         56         54%         7%

A problem with the Server Scenario is that it is possible to
correlate information about the availability of a service with the
availability of the nodes listed in the Directory Service. For
example, we poll each listed server every five minutes and correlate
the lists of active servers when we are able and unable to contact
the hidden service.
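The availability correlation just described can be sketched as follows. This is a minimal illustration, not the procedure used in the experiments: the poll format, the function name `rank_candidates`, and the node names are assumptions, and the sketch further assumes that the node hosting HS is always listed as active whenever the service is reachable.

```python
from typing import Iterable, List, Set, Tuple


def rank_candidates(polls: Iterable[Tuple[bool, Set[str]]]) -> List[str]:
    """Rank listed nodes by how well their uptime tracks the service's uptime.

    Each poll is (service_reachable, set_of_active_nodes), taken e.g. every
    five minutes. Hypothetical data format, chosen only for this sketch.
    """
    polls = list(polls)
    up_polls = [nodes for service_up, nodes in polls if service_up]
    down_polls = [nodes for service_up, nodes in polls if not service_up]
    if not up_polls:
        return []
    # A node hosting HS must be active in every poll where HS answered.
    candidates = set.intersection(*(set(nodes) for nodes in up_polls))
    # Among those, prefer nodes that were also absent when HS was unreachable.
    score = {
        node: sum(1 for nodes in down_polls if node not in nodes)
        for node in candidates
    }
    return sorted(candidates, key=lambda node: (-score[node], node))


example_polls = [
    (True, {"nodeA", "nodeB", "nodeC"}),
    (False, {"nodeA", "nodeC"}),
    (True, {"nodeA", "nodeB"}),
    (False, {"nodeA"}),
]
ranking = rank_candidates(example_polls)
```

In this toy data set `nodeB` is the only node that is up exactly when the service is up, so it is ranked first. With real data, more polls are needed before the candidate set shrinks to a single node.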

Our attack is based on the availability of a directory service
containing a public list of all server nodes in the anonymity
network. From this list the attacker will immediately be able to
tell the difference between connections from mere clients and
connections from network nodes. If a service is located at a client
outside the anonymizing network, Alice will know both this and the
client’s IP address as soon as she has a positive match in the
timing analysis (Section 4.2) on a connection originating from
outside of the currently listed server nodes. A connection carrying
the hidden service’s traffic can reach the attacker’s node from
outside the Tor network only if it originates at the actual location
of the hidden service or at its point of contact, e.g., a firewall
hiding internal addresses, an IPSec tunnel endpoint, etc.
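A minimal sketch of this matching step is given below. It assumes the attacker already holds the set of listed node addresses and a per-circuit verdict from the timing analysis. The function name, the return labels, and the example addresses (from the documentation range 192.0.2.0/24) are hypothetical.

```python
from typing import Set


def classify_predecessor(predecessor_ip: str,
                         listed_node_ips: Set[str],
                         timing_match: bool) -> str:
    """Classify the previous hop of a circuit seen at the attacker's node."""
    if not timing_match:
        # The circuit does not carry the attacker's own traffic pattern.
        return "unrelated"
    if predecessor_ip in listed_node_ips:
        # Could be an intermediate node, or HS running on a listed node.
        return "listed-node"
    # Matched traffic arriving from outside the listed nodes: this is the
    # hidden service itself or its point of contact (firewall, tunnel end).
    return "hidden-service-or-contact-point"


directory = {"192.0.2.10", "192.0.2.11", "192.0.2.12"}
verdict = classify_predecessor("192.0.2.200", directory, timing_match=True)
```

Only the last label ends the attack immediately. A `listed-node` verdict leaves the server scenario open and calls for the statistical methods of the following sections.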

Experimental Results: Our results confirmed this by a simple
matching of the IP addresses in the sampled circuits against the
list of known servers. Both of our Hidden Servers were run on Client
nodes and were easily confirmed as the source of the service. The
times until successful identification of the IP address in the four
tests of the experiment are shown in Table 1 under the column
“Time to first match”.

So,  if  the  hidden  service  is  located  at  a  client 
of  the  anonymity  network  an  attacker  will  find  it 
in  a  matter  of  minutes  using  only  one  node  in  the 
network,  but  if  the  service  is  located  at  a  network 
node  we  will  have  to  use  another  method. 


4.4  The  Predecessor  Attack 

The current implementation of the network is vulnerable to the
predecessor attack. This is a form of intersection attack. Since
intersection attacks treat the intervening anonymity network as a
black box, they are a threat to any anonymity network. Like other
intersection attacks, the predecessor attack has been shown to be
devastating in theory and simulation against various anonymity
networks but has never before been demonstrated on a live network.
Roughly, the predecessor attack looks at repeated connections
suspected to be to (or from) the same correspondent and looks at
intersections of predecessor nodes to see which occurs most often.
Our use of this attack is based on the assumption that the attacker
is able to positively identify the actual streams of data to and
from the client in other circuits, e.g., by using the Timing
Analysis described in Section 4.2.

In the case of Hidden Servers and using our scenario of attack, the
Predecessor Attack becomes trivial. Alice can now compile statistics
of the IP addresses that contacted the server in the cases where a
positive traffic-pattern match was found. By selecting only circuits
where there has been a match, and with an m-node path towards RP,
one single IP address will occur in around 1/m of these connections,
namely those in which HS selected Alice’s node as its first node.
The attacker will then easily identify the IP address of the Hidden
Server as long as m is significantly smaller than the number of
nodes in the network.
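The statistics involved are simple enough to sketch. The following illustration assumes that the timing-matched circuits have already been selected and that only their predecessor addresses are passed in. The function name and the addresses are hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def predecessor_tally(matched_predecessors: List[str]) -> List[Tuple[str, float]]:
    """Return (ip, share) pairs for matched circuits, most frequent first."""
    counts = Counter(matched_predecessors)
    total = sum(counts.values())
    return [(ip, n / total) for ip, n in counts.most_common()]


# Toy sample for a three-node path (m = 3): the hidden server's address
# appears in about one third of the matched circuits, every other
# predecessor is an arbitrary network node seen only once.
sample = ["192.0.2.1"] * 3 + [f"198.51.100.{i}" for i in range(6)]
tally = predecessor_tally(sample)
top_ip, top_share = tally[0]
```

The address at the head of the tally is the candidate for the Hidden Server. The gap between the largest and the second largest share indicates how much confidence the sample supports.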

Experimental Results: Our results confirmed the theory of the
predecessor paper. Sorting out the possible circuits based on timing
information and then statistically examining the connecting IP
addresses, we immediately found the expected results from the
predecessor attack. In every test we found that around 50%, or more,
of all connections identified as a part of the stream were made from
a single IP address, as shown under “Largest single IP” in Table 1.

From the experimental results we can also conclude that we need far
less data to pinpoint the location of the Hidden Server than we
gathered. A rough estimate is that within the order of an hour or
two we should have a positive match of the location of the hidden
service using the predecessor attack.

4.5 Distance Attack

If there is no information in IP address statistics (e.g., due to
mixing of traffic or using other countermeasures), an attacker must
use other techniques to locate the hidden service.

When Alice has a dataset that matches the generated set of data, she
can look at the response times in the communication with the
service. The attacker times the periods where the sampled data
switches from outgoing to incoming traffic, i.e., the round trip
time, enabling the calculation of a rough estimate of the distance
to the Hidden Server. These periods are marked in the example figure
with D for the Client’s response times and d for the round trip time
at the participating node. By grouping nodes based on measured round
trip times, the attacker is able to find some groups of nodes closer
to the Hidden Server than others.
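A rough sketch of this measurement follows. It assumes each matched circuit is available as a list of (timestamp in milliseconds, direction) cell records, where "out" means towards the Hidden Server and "in" means back from it. The record format and function names are assumptions made for this illustration only.

```python
from statistics import mean
from typing import Dict, List, Tuple


def round_trip_times(cells: List[Tuple[int, str]]) -> List[int]:
    """Time from the last outgoing cell to the first incoming cell after it."""
    rtts = []
    last_out = None
    for timestamp, direction in cells:
        if direction == "out":
            last_out = timestamp
        elif direction == "in" and last_out is not None:
            rtts.append(timestamp - last_out)
            last_out = None  # count only the first reply after each burst
    return rtts


def group_by_mean_rtt(circuits: Dict[str, List[Tuple[int, str]]]) -> List[Tuple[str, float]]:
    """Order predecessor addresses by mean round trip time, closest first."""
    means = []
    for predecessor_ip, cells in circuits.items():
        rtts = round_trip_times(cells)
        if rtts:
            means.append((predecessor_ip, mean(rtts)))
    return sorted(means, key=lambda item: item[1])


cells_example = [(0, "out"), (100, "out"), (350, "in"), (400, "in"),
                 (1000, "out"), (1200, "in")]
rtts_example = round_trip_times(cells_example)
```

Predecessors at the head of the list returned by `group_by_mean_rtt` are the ones most likely to be adjacent to the Hidden Server.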

Experimental Results: Our results confirmed the assumptions of the
distance attack. Using the data from the experiment we could see a
clear correlation between the response times and the distance from
the Hidden Server. When the Hidden Server was local it was of course
easy to find a match showing the attacker’s node next to the service
(round trip times differing by a factor of 100-1000 compared to the
other connections). But even when our service was located on
computers on the other side of the globe we could still
statistically observe when Alice was connecting directly to the
Hidden Server. The round trip times were larger by a factor of two
or more for the nodes not adjacent to the Hidden Server, as shown in
Fig. 4. The lower line represents the average response times for 52
samples of the nodes closest to the Hidden Server, and the upper
line is for the other 35 samples in our set where Alice is located
at Node 2 in Fig. 3. Due to the previously mentioned implementation
feature of Tor we were unable to find data when Alice is located as
Node 3, cf. Section 5.1.

Figure 4. Average round trip times at seven locations in a sample
of circuits

6 This result is for every test running without the use of “helper”
guard nodes. Cf. Section 5.4.

7 Given three nodes between HS and RP, we would expect to find a
common predecessor IP address in only about 33% of matching
connections. The discrepancy is due to an implementation feature of
Tor uncovered by our experiments. Cf. Section 5.1.

4.6  Owning  the  Rendezvous  Point 

By  extending  adversary  resources  and  using  two 
nodes  in  the  network,  it  is  possible  for  Alice  to  run 
attacks  where  she  owns  the  Rendezvous  Point.  This 
will  significantly  enhance  the  attack. 

Knowing only RP’s IP address already tells the attacker when she is
the last node (Node 3) in the circuit out from the Hidden Server.
Selection of the Rendezvous Point is done by the Client and enables
Alice to choose one of her nodes as RP, while still leaving her
other node free to be chosen by HS for the circuit to RP. This
allows Alice to tell when she is the second to last node in the
circuit as well (since both C and RP are connected to the same
node). This implies that if the path length is three before
connecting to HS (as currently implemented) the attacker is able to
determine the instance where she is Node 1, thus directly revealing
the IP address of the Hidden Server. The speed and accuracy of the
attack are then greatly improved, and the result will be as fast as
in the Service Location Attack, except that this own-the-rendezvous
attack will identify services located at network servers as well as
those located at clients.
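The position inference can be sketched as below. It assumes a three-node path between HS and RP, a circuit already matched by timing analysis, and that Alice can compare the neighbours seen at her ordinary node with the predecessor seen at her Rendezvous Point. All names and addresses are hypothetical.

```python
def infer_position(node_next_hop: str, rp_ip: str, rp_previous_hop: str) -> int:
    """Infer the position (1, 2 or 3) of Alice's node on a matched circuit.

    node_next_hop    address Alice's node forwards the circuit to
    rp_ip            address of Alice's own Rendezvous Point
    rp_previous_hop  address from which the circuit arrives at her RP
    """
    if node_next_hop == rp_ip:
        return 3  # her node talks directly to RP
    if node_next_hop == rp_previous_hop:
        return 2  # her node and her RP share the same neighbour (Node 3)
    return 1      # otherwise her predecessor is the Hidden Server itself


RP = "192.0.2.50"
NODE3 = "192.0.2.33"
position = infer_position(node_next_hop="192.0.2.22", rp_ip=RP,
                          rp_previous_hop=NODE3)
```

When the function returns 1, the previous hop recorded at Alice's node is the address of the Hidden Server. Circuits in positions 2 and 3 can be dropped immediately.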

5  Countermeasures:  More  Hidden 
Services 

5.1  Allowing  Middleman  nodes  to 
connect  to  Rendezvous  Points 

This first point is really about an implementation feature of Tor’s
hidden services that facilitates our attacks rather than a
limitation of the hidden services system design. But since changing
the feature does slow down the attacks, we list it here.

To save time, all Tor clients (including hidden servers) establish
circuits offline, i.e., while awaiting service requests. Upon
receiving a rendezvous request and an RP location, HS extends such
circuits to RP. Tor clients not operating as hidden services
typically will need circuits that terminate at nodes that allow exit
from the Tor network on common ports, such as those for http, ssh,
and https. Almost all of the new “stand-by” circuits established
thus go to a node that allows such exit, which seems quite
reasonable considering normal client use. A hidden server should
similarly always have at least one circuit available at a random
node of the network ready for connection to the Rendezvous Point.

This creates an advantage for our attacker. By running in middleman
mode (never allowing circuits to exit the Tor network at that node)
she both reduces the overhead of running a node and guarantees that
whenever her network node is used between HS and RP, it will almost
always be in the first or second position, which increases the
efficiency of her attack. Our experiments uncovered this
undocumented feature of the Tor implementation. It is a trivial
change to allow the third node from HS to RP to be any node, not
just an exit node. This has now been implemented by the Tor
developers and is available in the latest versions.

8 We had no occurrence of being Node 3 in the sample sets described
in this paper.

5.2  Dummy  traffic 

In anonymous communication, dummy traffic is a countermeasure to
traffic analysis that is often initially suggested. However, dummy
traffic is expensive, and, despite research, it has yet to be shown
that dummy traffic defeats any active attacks on low-latency systems
unless the system will also bring most or all of the network to a
stop in response to one non-sending client (as in Pipenet). Since
this makes it trivial for any user to bring down the network, it is
generally seen as a price few would pay for anonymity, which means
that even those who would pay it would be hiding in a very small
anonymity set. While some dummy traffic schemes have been proposed
that attempt to address some active attacks, no fielded low-latency
systems currently use dummy traffic. In light of our attacks, we
describe why dummy traffic would be an especially ineffective
countermeasure for hidden services.

In our attack scenario, Alice can develop a list of candidate
circuits by labeling any circuits through her network node that show
a response from the server shortly after she sends a request with an
RP address to HS. This would potentially include many false
positives. She can then induce a timing signature in her network
node on all responses from a server on candidate circuits. This can
be done exactly in the manner of the timing signature used by
Murdoch and Danezis, except that our attacks do not require
collusion by the external server. Alice’s client then simply looks
for the same timing signature. The important thing to note is that
no dummy traffic scheme could prevent this. If dummy traffic is sent
all the way to the client, Alice can of course detect it since she
controls the client. If dummy traffic is sent by HS to some node
between Alice’s node and Alice’s client, this will result in some
differences between the induced signature and the one seen by the
client. Nonetheless, for low-latency traffic this strong signature
would clearly remain easily identifiable. In the experiments of
Murdoch and Danezis, the corrupt server would send for between 10
and 25 seconds and then stop sending for between 30 and 75 seconds.
This was detected indirectly in those experiments by interference
with external probes of Tor nodes. In our case, Alice would have
direct access to the circuits. This also means that the attack would
scale to current network sizes and beyond. Note also that no
existing dummy scheme would even affect the signature of traffic
sent from Alice’s client to the Hidden Server through Alice’s node.
While this traffic is typically much lower in volume, it can still
be used to identify the circuit.
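To illustrate why padding does not hide such a signature, the following sketch compares an induced on/off pattern with per-interval cell counts seen at the client. It is a deliberate simplification: it models only constant-rate dummy traffic, uses a plain Pearson correlation as the detector, and all names and numbers are assumptions rather than measurements.

```python
from typing import Sequence


def correlation(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation of two equally long series (0.0 if one is flat)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


# Induced signature at Alice's node: 1 = forward cells, 0 = hold them back.
induced = [1, 1, 0, 0, 0, 1, 1, 0, 0, 0]
# Cells per interval seen at Alice's client on the true circuit, with a
# constant 2 dummy cells per interval added somewhere along the path.
observed_true = [10 * bit + 2 for bit in induced]
# An unrelated circuit carrying a steady stream.
observed_other = [5] * len(induced)

score_true = correlation(induced, observed_true)
score_other = correlation(induced, observed_other)
```

The padding shifts every interval by the same amount, so the correlation with the induced pattern stays essentially at 1 for the true circuit, while the unrelated circuit scores 0.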

5.3 Extending the Path from Hidden Server to Rendezvous Point

As described in Section 4.6, if Alice owns at least two nodes she
can have her client name one of them as the Rendezvous Point. If her
other node is chosen by HS as Node 2, then she will be able to
confirm this immediately with high confidence because both her nodes
will be connected to Node 3. And as before, if Alice were connected
as Node 3, then she would also know this. This means that Alice can
easily know when she has a circuit match being Node 1, which is
especially significant if HS is configured as in the server scenario
of Section 4.3. However, this also means that Alice can more quickly
abandon circuits when she does not have the Node 1 position,
speeding up the attack. It also allows rapid identification of guard
nodes (cf. Section 5.4).

9 The attacks of Murdoch and Danezis required probing the entire
network and were done on a network an order of magnitude smaller
than the current Tor network. It is an open question whether they
would scale to the current one.

Figure 5. Use of Entry Guard Nodes

A simple countermeasure to this is to allow HS to extend the path
length, l, to RP by one hop. The attacker owning the Rendezvous
Point will now be able to determine when she is located as Node 3 or
Node 4, but unable to differentiate between the positions 1 and 2,
forcing Alice to use the predecessor or service location attack.
Extending the path will also slow down the predecessor attack and
the timing analysis by a factor of 1/l since that is the frequency
with which Alice’s node will be chosen as Node 1 within the matching
circuits. So this countermeasure only causes a minor effect on the
speed of the predecessor attack, and has no effect on the location
attack.
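The size of this effect can be checked with a few lines of arithmetic. The sketch assumes that Alice's node is equally likely to occupy each of the l positions on a matching circuit. This is an idealization, since the deployed implementation deviated from it, and the function names are hypothetical.

```python
from fractions import Fraction


def node1_share(path_length: int) -> Fraction:
    """Fraction of matching circuits on which Alice's node is Node 1."""
    if path_length < 1:
        raise ValueError("path length must be at least 1")
    return Fraction(1, path_length)


def relative_slowdown(old_length: int, new_length: int) -> Fraction:
    """How many times more matching circuits are needed per Node 1 hit."""
    return node1_share(old_length) / node1_share(new_length)


share_three = node1_share(3)
share_four = node1_share(4)
slowdown = relative_slowdown(3, 4)
```

Going from three to four hops lowers the Node 1 share from 1/3 to 1/4, so Alice needs only 4/3 as many matching circuits, which is why the effect on the predecessor attack is minor.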

As  an  alternative,  we  could  allow  HS  to  choose 
RP.  This  would  be  a  minor  code  change  to  the  Tor 
hidden  services  protocol.  Whether  adding  a  node 
before  the  Rendezvous  Point  or  allowing  HS  to 
choose  RP,  this  would  also  seem  to  imply  a  longer 
path  between  client  and  HS  than  the  current  default 
Tor  HS  protocol,  i.e.,  seven  nodes  vs.  six.  This 
would  also  create  an  easier  attack  if  our  techniques 
were  used  to  locate  the  client,  which  would  then 
require  its  own  countermeasure. 

5.4  Using  Entry  Guard  Nodes 

All of our attacks rely on Alice being able to force the Hidden
Server to create new circuits until she can cause it to create a
circuit that first connects directly to her node. What if this could
never happen, or if the rotation of first nodes in the circuit were
slowed down? This would prevent or substantially slow our attacks.
This is the motivation behind entry guard nodes (or simply entry
guards), a concept introduced by Wright et al. [30]. That work
looked at attacks on various anonymous communication systems, but it
did not consider at all the specific concerns of hidden services.
The basic idea of a helper node in [30] was to always choose a
single node as the first node in a communication. If this is
compromised, then that end of your circuit is always compromised.
However, if it is not compromised, then the attacks we have
described cannot work because Alice will never own the node adjacent
to HS on the rendezvous circuit. (For a circuit initiator to better
hide the responder, Wright et al. also considered helper nodes as
the last node in the circuit as well as the first.)

10 Wright et al. named these nodes helper nodes, which we have
found to be too general an expression.

Table 2. Experimental results when Hidden Server is using Entry
Guard Nodes.

         Total circuits   Matched    Largest     Second    Third
         completed        circuits   single IP   largest   largest
Test 1   292              8          7           1         0
Test 2   106              6          5           1         0
Test 3   296              13         12          1         0
Test 4   292              10         4           3         3

Tor design has long allowed a specified list of entry nodes (and
exit nodes) which, when specified, will require all circuits to
enter (resp. exit) the network through nodes in that set, as
illustrated in Fig. 5. This can be set as either a preference
request or as a strict requirement [23]. The effectiveness of using
these as entry guard nodes to counter predecessor attacks is noted
by the Tor designers as an open research problem. We now explore the
idea of using entry guard nodes specifically to improve the
protection of hidden servers.

11 It is also possible to set an exit node preference in the URL
for specific HTTP requests [27].

There are several parameters and options possible in choosing entry
guard nodes. The first parameter is the entry guard set size, i.e.,
the number of nodes that HS will use as entry guards. The smaller
the set, the less risk that Alice owns a node in it; however, the
greater the chance that all the nodes in the set will be subjected
to monitoring or unavailable from either failure or attack. As
already noted, entry guard nodes may be preferred or strictly
required. As a refinement, there may be a successively preferred set
of entry guards. There may be a single layer of entry guard nodes,
i.e., nodes chosen to be immediately adjacent to HS, or they may be
layered, i.e., some guard nodes are chosen to be used for the first
hop, some for the second, and possibly further. Finally, they may be
chosen at random or chosen based on trust or performance.
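The first half of the set-size trade-off can be quantified with a short calculation. The sketch assumes that guards are drawn uniformly at random without replacement from all nodes, which is only one of the selection options listed above. The function name is hypothetical, and the network size of 450 nodes is the approximate figure given in the introduction.

```python
from math import comb


def p_guard_compromised(total_nodes: int, hostile_nodes: int,
                        guard_set_size: int) -> float:
    """Probability that at least one hostile node lands in the guard set."""
    honest_nodes = total_nodes - hostile_nodes
    # Chance that every guard is drawn from the honest nodes only.
    all_honest = comb(honest_nodes, guard_set_size) / comb(total_nodes, guard_set_size)
    return 1 - all_honest


p_one_guard = p_guard_compromised(450, hostile_nodes=1, guard_set_size=1)
p_three_guards = p_guard_compromised(450, hostile_nodes=1, guard_set_size=3)
p_nine_guards = p_guard_compromised(450, hostile_nodes=1, guard_set_size=9)
```

With a single hostile node the probability grows linearly with the set size (1/450, 3/450, 9/450). The availability side of the trade-off is not modelled here.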

Each of these choices is orthogonal, so each of the combinations of
choice will lead to systems with different properties. For space, we
will limit discussion to some of the more salient combinations.

Choosing a small set of entry guard nodes that are both permanent
and strictly required could lead to a higher percentage of service
failures, either by accident or by design (assuming a very powerful
adversary, with the capability to DoS all entry guard nodes). If a
permanent set is simply a preference, then DoS of all entry guard
nodes could lead back to our attacks if the guard nodes can be kept
down long enough. Of course this assumes that entry guards can be
identified by the attacker. We ran our attacks on a hidden server
that had chosen three entry guard nodes.

Experiment - Attacking Entry Guard Nodes: Letting the Hidden Service
use three permanent, preferred entry guards we found that these
nodes combined represented all identified connections through
Alice’s node, as shown in Table 2. This result was quite unexpected,
but it is caused by the implementation feature in Tor described
earlier: we were never Node 3, only Node 2 (Node 1 being the entry
guard).

As  in  our  previous  experiments,  identifying  the 
entry  guard  nodes  through  our  attacks  never  took 
more  than  a  few  hours. 

Backup guard nodes: Suppose there is a short list of entry guard
nodes that is preferred (e.g., three nodes) and a longer list of
guard nodes to be used as backup (e.g., nine) if those are not
available. If the adversary has the capability of keeping, say, four
nodes at a time offline, then it can cause HS to use other nodes in
the Node 1 position than those on the short list. But all that will
accomplish is causing HS to rotate to three new nodes from the
longer list as primary guards. Alice can cause rotation of circuits
but only through a still relatively small set of entry guard nodes,
and this only through sustained attacking. We can however make it
more difficult for Alice to find the entry guard nodes at all via
our attacks.

Layering guard nodes: Suppose, e.g., that HS has a set of three
entry guard nodes from which to choose Node 1, and for each of these
has a set of three guard nodes from which to choose Node 2, as
illustrated in Fig. 6. As before, Alice can only successfully find
HS if she owns one of these three layer 1 entry guard nodes. But if
she does not, in order to even identify one of those layer 1 nodes
to attack she must own one of the three layer 2 guard nodes
associated with that node.

Figure 6. Use of Layered Entry Guard Nodes

Layering guard nodes would require much more substantial changes to
the Tor code than the experiments we have already run, albeit
probably fairly straightforward changes. We have thus not had a
chance to conduct experiments on either layering or backup guard
node configurations. However, a version of backup guard nodes has
recently been implemented in Tor in response to our results. At the
time of writing, by default each client running the latest Tor code
chooses three nodes as initial preferred entry guards. When the
available entry guard set shrinks below two nodes, two more nodes
are added to the set. However, the software keeps track of when a
node enters the set and prefers to choose entry nodes for circuits
that were in the set sooner. Nodes are only deleted from the set
when they have been unreachable for an extended period (currently
one month).
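The guard bookkeeping described in this paragraph can be modelled roughly as follows. This is an illustrative sketch and not the Tor source code: the class and method names are hypothetical, "one month" is approximated as 30 days, and always picking the oldest reachable guard is a simplification of "prefers".

```python
import random


class GuardSet:
    INITIAL_GUARDS = 3   # guards chosen at start-up
    MIN_AVAILABLE = 2    # below this many reachable guards, add more
    GUARDS_TO_ADD = 2
    EXPIRY_DAYS = 30     # assumed stand-in for "one month"

    def __init__(self, nodes, rng=None):
        self.rng = rng or random.Random()
        self.guards = []       # ordered by time of entry, oldest first
        self.down_since = {}   # guard -> day it was first seen unreachable
        self._add(self.INITIAL_GUARDS, nodes)

    def _add(self, count, candidates):
        pool = sorted(n for n in candidates if n not in self.guards)
        self.guards.extend(self.rng.sample(pool, min(count, len(pool))))

    def update(self, reachable, today):
        """Record reachability for one day and replenish the set if needed."""
        for guard in list(self.guards):
            if guard in reachable:
                self.down_since.pop(guard, None)
            elif today - self.down_since.setdefault(guard, today) >= self.EXPIRY_DAYS:
                self.guards.remove(guard)
                del self.down_since[guard]
        available = [g for g in self.guards if g in reachable]
        if len(available) < self.MIN_AVAILABLE:
            self._add(self.GUARDS_TO_ADD, reachable)

    def pick(self, reachable):
        """Return the guard that entered the set earliest and is reachable."""
        for guard in self.guards:
            if guard in reachable:
                return guard
        return None
```

Because unreachable guards stay in the set until they expire, an adversary who knocks guards offline only pushes HS onto the next few nodes in the list. HS returns to the earlier guards as soon as they are reachable again.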


Nonrandom choice of entry guard node sets: To avoid circuit rotation
simply from failed entry guard nodes it might seem that it is best
to choose as guard nodes those that have the best uptime, and
perhaps bandwidth. This is, however, subject to abuse since
adversaries may run highly reliable, highly performing nodes in
order to increase their chances of being chosen as entry guard
nodes. And this is especially easy to abuse in the current Tor
directory statistics in which nodes report their own performance.
This is a specific instance of a more general problem in trying to
build reliable anonymous communication. One possible solution is to
order nodes by performance and reliability but then to choose from a
large enough set in this order that the adversary is unlikely to be
able to substantially alter the chances of being chosen as an entry
guard node. Dingledine and Syverson described this strategy to form
a reliable anonymous communication network of mix cascades.
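A minimal sketch of this "rank, then choose from a large pool" strategy is shown below. The node records, the use of uptime as the only ranking key, and the parameter names are assumptions made for illustration. How large the pool must be to blunt an adversary is exactly the open design question and is not answered by the code.

```python
import random
from typing import List, Optional, Tuple


def choose_guards(nodes: List[Tuple[str, float]], k: int, pool_size: int,
                  rng: Optional[random.Random] = None) -> List[Tuple[str, float]]:
    """Pick k guards uniformly from the pool_size most reliable nodes.

    nodes is a list of (name, uptime) pairs. A larger pool_size gives a
    single high-uptime hostile node a smaller chance of being selected.
    """
    if k > pool_size:
        raise ValueError("pool_size must be at least k")
    rng = rng or random.Random()
    ranked = sorted(nodes, key=lambda node: node[1], reverse=True)
    pool = ranked[:pool_size]
    return rng.sample(pool, min(k, len(pool)))


example_nodes = [("n%d" % i, float(i)) for i in range(10)]
chosen = choose_guards(example_nodes, k=3, pool_size=6, rng=random.Random(1))
```

With `pool_size` equal to `k` this degenerates into always taking the top performers, which is the abusable policy the paragraph warns against.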

Another possibility is to choose entry guard nodes based on trust in
the node administrator. It is difficult to attach probabilities to
Alice’s being trusted by the Hidden Server administrator or, perhaps
more likely, to her compromising a node run by someone trusted by
the Hidden Server administrator. (Trust in honesty should not be
confused with trust in competence.) Perhaps a greater concern is
that common properties of the administrators of chosen entry guard
nodes (e.g., they are all family relatives) may lead an adversary to
form a hypothesis of who is running HS, which may then lead to
attacks unrelated to use of the network per se. Here the layering
approach described above may prove useful. If the layer 1 nodes are
personally trusted, and the layer 2 nodes are chosen as random sets,
then it becomes more difficult for an adversary to discover the set
of entry guard nodes and thus to correlate external properties.

6 Conclusion

Our results show that Tor’s location-hidden servers are not really
hidden, or rather they were not really hidden prior to the recent
introduction of guard nodes as countermeasures to our attacks. Using
our attacks, all an attacker needed was one compromised node in the
network and the “Hidden Server” was identified.

We have demonstrated that an attack with one compromised node in the
anonymity network takes only minutes if the service is located at a
client, or a couple of hours when located on a server node. By using
two nodes in the network it only takes minutes to find the Hidden
Server regardless of where it is located.

We have also argued that neither dummy traffic nor extending the
path length from the Hidden Server to the Rendezvous Point will
protect against all of our attacks. However, requiring hidden
services to always use entry guard nodes, which are currently
available as a general option in the Tor code, greatly reduces the
probability of successful attacks against a hidden service.

Using random entry guard nodes may still leave the Hidden Server
vulnerable to our attacks if the attacker is powerful enough to
completely deny service to a small set of nodes or to compromise
them by physical or other means. But using backup guard nodes and/or
layering guard nodes will significantly slow down even such an
attacker.

Using random selection of backup and layering entry guard nodes will
be an improvement, but as in all Tor circuits, someone connecting
through random nodes will always be compromised if an attacker owns
just two nodes. Using the backup and layering techniques in
combination with a nonrandom selection, e.g., based on some kind of
trust in, or experience with, the nodes, may slow the attack even
more or may even prevent it entirely.

We have demonstrated attacks that surprisingly require just one or
two hostile nodes. What is possible for an adversary that controls
several nodes, or even, e.g., two percent of the network? We will
investigate this in future work. Other future work includes testing
implemented countermeasures for vulnerabilities using the described
attack scenarios, as well as new ones. We will also be investigating
improved performance by shrinking the path length between the Client
and the Hidden Server. We speculate that it may be possible to do so
with adequate security when using our suggested countermeasures, and
possibly others. We will also turn our attack on its head: testing
to locate a client by having an attractive hidden service.